New definition of open source AI is “flawed”, experts say
A new definition for ‘Open Source AI’ has been launched by the Open Source Initiative (OSI), though experts have told ITPro that the terms lack nuance and that the OSI may not be the right organization to manage the definition.
The new definition focuses on AI training data, clarifying that training data must be shared and disclosed. It also states that code must be complete enough for recipients to understand how the training was done.
“In my opinion, OSI continues with its flawed ‘one size fits all’ approach rather than helping to better define the ‘spectrum’ for open source and AI,” Peter Zaitsev, founder of Percona, told ITPro.
The OSI’s new definition recognizes four categories of training data: open, public, obtainable, and unshareable. It noted that, while the legal requirements differ for each category, all must be shared in a form allowed by law to adhere to the new terms.
The definition highlights two key features, the first of which demands that the code used to train and process data in AI development be complete enough for open source recipients to understand how the training was done.
Training is where the innovation is taking place, the OSI said, so transparency around the code used in training is necessary to allow open source users to study and modify AI systems.
The second feature acknowledges that ‘copyleft-like’ requirements are admissible, under which the training code and a dataset are bundled together in a legal sense.

The OSI’s new definition is at the ‘release candidate’ stage, meaning no new features will be added going forward, only bug fixes.
A contentious definition
Zaitsev takes issue with several terms in the definition, particularly the classification of data into the ‘obtainable’ and ‘unshareable’ categories. In a linked FAQ, the OSI clarified that obtainable data may be revealed for a cost, while unshareable data can only be revealed in the form of a detailed description.
“While it makes a lot of difference for actual users, if training data is not freely available for everyone, it is not the same as ‘open source’,” Zaitsev said.
“I think it would make sense for OSI to lead the effort to properly define the standard classification for these free-to-use models which, in my opinion, and in particular due to massive training costs, is where potential value for competition will be massive,” he added.
Amanda Brock, CEO of OpenUK, told ITPro that the issues go beyond the content of the definition itself, stemming from the OSI’s position as an institution.
“This is not only concern about the content of any definition, but whether there should be an ‘open source AI definition,’ and, if there is one, whether the OSI is the right organization to create it and whether the broader open source software community support its changed role as the custodian of two definitions,” Brock said.
“The OSI’s stated purpose is around open source software – yes, advocating for open source principles is a part of that purpose, but it’s questionable whether managing a whole new definition in AI falls under that purpose,” she added.
Brock believes the OSI should keep its focus on open source software, which in her view is more than enough work for one small organization to manage. The OSI’s role as guardian of the Open Source Definition (OSD) is critical, she added.
“The open source software community is being split and fractured by the new Open Source AI definition,” Brock said.